Abstract


Recent efforts in predicting rocket propulsion (RP-1) fuel performance through modeling
put greater emphasis on obtaining detailed and accurate fuel properties, as well as on elucidating
the relationships between fuel composition and fuel properties. Herein, we study
multidimensional chromatographic data obtained using an instrumental platform that
included comprehensive two-dimensional gas chromatography combined with time-of-flight
mass spectrometry (GC x GC - TOFMS) to analyze RP-1 fuels. For GC x GC separations,
RTX-wax (polar stationary phase) and RTX-1 (non-polar stationary phase) columns were
implemented for the primary and secondary dimensions, respectively, to separate the chemical
compound classes (alkanes, cycloalkanes, aromatics, etc.), providing a significant level of
chemical compositional information. The GC x GC - TOFMS data were analyzed using partial
least-squares regression (PLS) chemometric analysis, specifically to model and predict advanced
distillation curve (ADC) data for ten RP-1 fuels that were previously analyzed using the ADC
method. The PLS modeling provides insight into the chemical species that impact the observed
changes in the previously collected ADC data. The PLS modeling correlates compositional
information found in the GC x GC - TOFMS chromatogram of each RP-1 fuel with its
respective ADC, and allows prediction of the ADC for each RP-1 fuel with good precision and
accuracy. The predictive power of the overall method via PLS modeling was assessed using
leave-one-out cross-validation (LOOCV), yielding low root-mean-square error of cross-validation
(RMSECV) values, typically below 2.0 °C, at each % distilled measurement point
during the ADC analysis.

Keywords:  GC  x  GC  -  TOFMS,  partial  least  squares  (PLS)  analysis,  advanced  distillation  curve 
(ADC),  two-dimensional,  gas  chromatography,  RP-1  fuel. 
Introduction


The  chemical  composition  of  a  kerosene  fuel,  though  complex,  holds  a  key  to 
understanding  and  altering  the  physical  properties  and  performance  of  the  fuel  [1-7].  Achieving 
fine control over the chemical composition can be a difficult task. It has become increasingly
important to gain further insight into fuel composition, and into the sources of variation in
fuel composition, both to maintain and control fuel performance and to assess the
performance of “field” fuels [1-5]. Fuel performance is inextricably tied to characterization, and
the advanced distillation curve (ADC) method has proven to be a well-suited approach
for the analysis and characterization of complex fuels [8-10]. The ADC method is a state-of-the-art
approach for analyzing the boiling curve of complex liquids with high accuracy and precision.
Samples  (i.e.,  distillation  fractions)  may  be  obtained  during  the  distillation,  and  can  be  further 
analyzed  both  qualitatively  and  quantitatively. 

The  ADC  method  was  pioneered  by  Bruno  and  co-workers  [2,  5,  8-20].  Briefly,  the 
apparatus  for  the  ADC  method  utilizes  a  round-bottom  flask  connected  to  an  air  cooled 
condenser,  a  receiver  adapter  and  a  calibrated  volume  receiver.  The  flask  is  encased  with  a 
heater  in  an  aluminum  jacket.  Inside  the  flask  are  two  thermocouples,  suspended  using  a 
centering  adapter:  one  thermocouple  measures  the  temperature  of  the  liquid  analyzed,  and  the 
other  thermocouple  measures  the  temperature  in  the  headspace  above  the  liquid  being  distilled. 
Three  bore  scope  ports  are  strategically  located  to  inspect  both  the  liquid  and  the  thermocouples 
inside  the  apparatus.  The  flask  is  connected  to  an  air  cooled  condenser  chilled  with  a  vortex 
tube,  wherein  the  distillate  condenses.  The  condenser,  in  turn,  is  connected  to  a  special  adapter 
where  the  drops  of  distillate  fall  into  a  small  0.05  mL  “hammock.”  With  the  use  of  a  syringe,  the 
distillate  may  be  sampled  from  the  hammock  for  further  analysis  including,  but  not  limited  to, 
gas chromatography (GC) [9-15], infrared spectroscopy [12], and measurements of enthalpy of
combustion  [14].  After  the  adapter,  the  distillate  reaches  the  calibrated  volume  receiver.  More 
recently,  a  variation  of  the  ADC  method  apparatus  was  implemented  that  controls  the  internal 
pressure,  preventing  sample  degradation  due  to  reactions  that  may  potentially  occur  at  high 
temperatures  when  analyzing  samples  containing  low-volatility  compounds  [17].  This  feature 
was  achieved  by  sealing  every  connection  between  parts  of  the  apparatus  and  using  a  commercial 
pressure controller. Sampling is performed with a reduced pressure balance syringe.

The  ADC  method  has  been  instrumental  in  the  study  of  a  variety  of  complex  liquid 
samples  including,  but  not  limited  to,  crude  oil  [12],  gasoline  [16],  biodiesel  fuel  [17,  19],  jet  fuel 
[5,  10,  11],  motor  oil  [18],  and  rocket  propellant  (RP)  [2,  9,  10,  13-15,  20].  The  ADC  method 
can  be  used  not  only  to  provide  information  regarding  sample  composition,  but  also  to  study  the 
thermodynamic and physical properties, chemical properties such as corrosive effects [12],
enthalpy of combustion (through the use of each distillate fraction to determine the overall
enthalpy  of  combustion)  [5,  10,  11,  15-16],  and  the  influence  of  thermal  stress  on  fuels  [15]. 
Furthermore,  the  variability  in  fuel  composition  and  its  impact  on  thermophysical  properties 
have  also  been  investigated  [20]. 

In  conjunction  with  implementing  the  ADC  method,  it  has  become  apparent  that 
additional  chemical  composition  information  should  be  evaluated  to  strengthen  and  ultimately 
apply  the  information  gained  from  ADC  data.  For  this  purpose,  in  this  report  we  applied  the 
powerful  chemical  analysis  tool  known  as  comprehensive  two-dimensional  gas  chromatography 
combined  with  time-of-flight  mass  spectrometry  (GC  x  GC  -  TOFMS),  using  a  reverse  column 
GC  x  GC  configuration  (i.e.,  polar  primary  dimension  column  coupled  with  a  non-polar 
secondary  dimension  column)  [21]  building  from  our  previous  study  [22],  to  improve  the 
separation of the various compound classes (e.g., alkanes, cycloalkanes, aromatics, etc.), and to
facilitate  extraction  of  chemical  information  from  a  set  of  ten  RP-1  fuel  samples.  Using 
chemometrics,  we  then  explored  the  connection  between  chemical  composition  via  GC  x  GC  - 
TOFMS  chromatographic  data  and  the  ADC  data  from  the  RP-1  fuels.  Indeed,  GC  x  GC  - 
TOFMS  is  ideally  suited  for  use  in  fuels  analysis  [21-31]. 

To  help  glean  useful  information,  multivariate  “chemometric”  data  analysis  methods 
have  been  developed.  Chemometrics  have  been  shown  to  be  able  to  take  advantage  of  the 
three-way  data  provided  by  the  GC  x  GC  -  TOFMS  instrumental  platform,  to  help  reveal 
similarities  and/or  differences  between  chromatograms  [22-25,  32].  Partial  least-squares  (PLS) 
analysis  can  be  used  to  associate  variance  in  fuel  composition  to  measured  physical  properties 
[22].  Detailed  information  on  the  theory  of  PLS  can  be  found  elsewhere  [34-36].  In  this  study, 
GC  x  GC  -  TOFMS  chromatographic  data  of  RP-1  fuels  and  their  respective  ADC  data  are 
analyzed  using  PLS  to  provide  useful  information  on  chemical  compounds  that  significantly 
influence the RP-1 fuel properties via inspection of the linear regression vector (LRV) of each
PLS  model.  This  analysis  is  accomplished  by  selecting  an  appropriate  number  of  latent  variables 
(LVs)  that  are  used  to  calculate  loadings  that  capture  the  variance  (i.e.  chemical  information)  in 
the  GC  x  GC  -  TOFMS  chromatograms  that  have  the  maximum  covariance  with  corresponding 
information  in  the  ADC  data  set.  Our  goals  are  to  demonstrate  and  validate  the  use  of  PLS 
modeling,  and  to  relate  chemical  information  obtained  from  the  GC  x  GC  -  TOFMS 
chromatograms  to  the  corresponding  ADC  for  each  RP-1  fuel,  and  ultimately  to  predict  the  ADC 
temperatures  of  a  given  RP-1  fuel,  without  directly  making  those  measurements  [2].  This 
chemical  analysis  approach  has  the  ability  to  provide  insight  into  the  chemical  composition 
changes  as  a  function  of  %  distilled  (and  distillation  temperature  during  the  ADC  experiment). 
Eventually, this chemical analysis approach should provide insight into, and aid in, the process of
optimizing fuel performance.

Experimental 

GC  x  GC  -  TOFMS  data  collection 

The  full  details  on  the  GC  x  GC  -  TOFMS  instrumental  platform  and  methodology  can 
be  found  in  our  previous  report  [22].  Ten  RP-1  fuel  samples  were  obtained  from  the  Air  Force 
Research Laboratory (AFRL), Edwards AFB, CA, and are listed in Table 1. The ADC data
were obtained from an earlier report [2]. The GC x GC - TOFMS instrument used was an
Agilent 6890A GC with a 7683B auto-injector (Agilent Technologies, Palo Alto, CA, USA)
coupled to a LECO Pegasus-III TOFMS (LECO, St. Joseph, MI, USA). Isobaric mode was used
with an inlet pressure of 35 psig (241 kPa). The auto-injector was set to a 1 µL injection; a 200:1
split injection with helium carrier gas was used, and acetone was used as the solvent rinse. The
first GC x GC separation dimension (primary column) used an RTX-wax (polar) stationary phase,
30 m in length, with a 250 µm i.d. and a 0.5 µm film. The modulation period was set to 2.5 s. The
second separation dimension (secondary column) used a 1.2 m RTX-1 column, with a 100 µm i.d. and a 0.18
µm film. The GC oven was initially set to 40 °C for 2 min and ramped to 225 °C at a rate of 6
°C/min;  the  final  temperature  was  maintained  for  3  min.  The  GC  inlet  was  set  to  225  °C  and  the 
transfer  line  temperature  was  235  °C.  The  thermal  modulator  offset  was  20  °C,  with  a  hot  pulse 
time  of  0.59  s  and  a  0.35  s  cool  time.  The  secondary  column  oven  temperature  control  was  not 
used  while  still  achieving  a  suitable  GC  x  GC  separation,  and  the  secondary  oven  (housed  in  the 
primary  oven)  was  left  open  and  set  at  the  same  nominal  temperature  as  the  primary  oven.  The 
TOFMS  data  acquisition  parameters  were  set  with  a  120  s  acquisition  delay,  a  mass  channel 
(m/z)  scan  range  of  35-334  amu,  with  a  100  Hz  acquisition  rate. 
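As a rough check on the size of the raw data implied by these settings, the oven program, modulation period, and acquisition rate can be combined arithmetically. The sketch below assumes that spectra are recorded from the end of the acquisition delay to the end of the oven program; that assumption is not stated in the text, and the variable names are illustrative only.

```python
from fractions import Fraction

# Instrument settings quoted from the text (exact rational arithmetic avoids
# floating-point rounding in the division by the ramp rate).
initial_hold_s = 2 * 60
ramp_s = Fraction(225 - 40, 6) * 60      # 40 °C to 225 °C at 6 °C/min
final_hold_s = 3 * 60
modulation_period_s = Fraction(5, 2)     # 2.5 s
acquisition_rate_hz = 100
acquisition_delay_s = 120

# Total oven program time.
run_time_s = initial_hold_s + ramp_s + final_hold_s

# Assumption: acquisition runs from the end of the delay to the end of the program.
recorded_s = run_time_s - acquisition_delay_s

# Spectra collected within one modulation (secondary-dimension points).
spectra_per_modulation = int(modulation_period_s * acquisition_rate_hz)

# Number of complete modulations (primary-dimension points).
modulations = int(recorded_s // modulation_period_s)

print(run_time_s, spectra_per_modulation, modulations)
```

Under these assumptions each modulation holds 250 spectra, which halves to the 125 secondary column points quoted in the data analysis section after binning by 2. The 812 modulations halve to 406, one more than the 405 primary column points reported, so the recorded window assumed here is evidently not exact.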
Data analysis: PLS of GC x GC - TOFMS data


The computer used for analysis had an Intel Core i3-2120 processor @ 3.3 GHz and 16.0 GB of
RAM, and included a 60 GB SSD used as a page disk (“fast” virtual RAM).
Two  replicate  sets  of  RP-1  GC  x  GC  -  TOFMS  chromatograms  were  analyzed  as  separate  sets  of 
PLS  models  as  described  below,  and  the  results  for  both  replicates  are  provided  herein,  overlaid 
in  figures,  similar  to  previous  reports  [22,  30].  Chromatographic  runs  were  imported  to 
MATLAB 2009b (MathWorks, Natick MA) using the ‘peg2maf’ function [22, 37-38].

The  GC  x  GC  -  TOFMS  data  underwent  baseline  correction  using  in-house  software  as  reported 
previously  [22],  and  to  help  save  memory  and  computation  time,  the  data  also  underwent  a 
condensing  procedure  [22,  39]  that  included  the  following  operations.  First,  the 
chromatographic  data  were  binned  (for  2  points  in  each  chromatographic  dimension,  resulting  in 
GC  x  GC  -  TOFMS  chromatograms  that  are  25%  of  their  original  size).  The  binning  also 
addressed any minor run-to-run misalignment in the data [39]. Second, in the TOFMS domain,
m/z channels that were unselective, or that did not exhibit signal greater than five times the
standard deviation of the baseline-corrected noise, were omitted (these m/z are: 35-37, 43-47, 51,
58-62, 73-76, 87-90, 101-103, 115-118, 133, 207, 214-334). Third, the signal for uninformative
temporal regions was set to 0, specifically, GC x GC regions dominated by column bleed or with
no analyte compound signal (these regions were initially inspected while taking chromatogram
variability into consideration to prevent the chance of removing compositional variation). The
chromatographic and mass spectral dimensions of the GC x GC - TOFMS data for each RP-1
fuel were vectorized (from 10 fuels x 125 secondary column data points x 405 primary column
data points x 148 mass channels to 10 fuels x 7,492,500 unfolded data points) prior to PLS
analysis along with the ADC (in vector form) for each RP-1 fuel. PLS analysis was performed
using PLS Toolbox 6.7 (Eigenvector Research Inc., Wenatchee WA), with mean centering of the
GC x GC - TOFMS data and auto scaling (subtracting the mean and dividing by the standard
deviation) for the ADC temperature values.
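The channel and size bookkeeping in the condensing procedure can be verified with a few lines of arithmetic. The sketch below uses only the scan range, the omitted m/z list, and the binned dimension sizes quoted above; the helper name `expand` is hypothetical and is not part of the in-house software.

```python
def expand(ranges):
    """Expand inclusive (low, high) m/z ranges into a set of integer channels."""
    channels = set()
    for low, high in ranges:
        channels.update(range(low, high + 1))
    return channels

# Scan range and omitted channels as listed in the text.
scanned = expand([(35, 334)])
omitted = expand([(35, 37), (43, 47), (51, 51), (58, 62), (73, 76), (87, 90),
                  (101, 103), (115, 118), (133, 133), (207, 207), (214, 334)])
retained = sorted(scanned - omitted)

# Dimension sizes after binning by 2 in each chromatographic dimension.
n_secondary, n_primary = 125, 405

# Length of one unfolded (vectorized) chromatogram.
unfolded_length = n_secondary * n_primary * len(retained)

# Binning by 2 in both chromatographic dimensions keeps 1/4 of the points.
binned_fraction = (1 / 2) * (1 / 2)

print(len(retained), unfolded_length, binned_fraction)
```

The retained channel count and the unfolded length agree with the 148 mass channels and 7,492,500 data points per fuel stated in the text.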

Data  analysis:  PLS  of  ADC  data 

Using the ADC method for an RP-1 fuel analysis, the temperature is recorded at the
moment a specific percentage of the fuel has been distilled (% distilled point) [2]. For this study,
temperatures for the ADC method were measured at nineteen % distilled points: 0.025, 5, 10, 15,
20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, and 90 [2]. Rather than construct a single
PLS model for the entire ADC data set (simultaneously on all nineteen measurement points
along the % distilled axis of the ADC for all fuels in the sample set), a series of 19 PLS models
(one PLS model at each % distilled point) was produced. Performing the PLS analysis using a
series  of  19  models  offered  several  key  advantages.  First,  this  approach  lessened  the  restrictions 
on  PLS  when  constructing  the  model(s).  Second,  this  approach  offered  the  ability  to  change  the 
number  of  LVs  at  different  %  distilled  points  in  the  ADC  (if  necessary).  Different  numbers  of 
LVs can be expected because the composition of a fuel is known to change over the course of the
distillation, i.e., the GC x GC - TOFMS chromatographic data represent the initial chemical
composition of a given fuel, whereas the composition at a given % distilled is a subset of this
composition, with possibly different relative concentrations for the various compounds present.

A  third  important  advantage  for  constructing  a  series  of  19  PLS  models  was  to  save  computation 
time.  Consider  modeling  the  entire  ADC  data  set  (10  fuels  x  19  %  distilled  points)  coupled  with 
the  unfolded  GC  x  GC  -  TOFMS  chromatograms  (as  stated  previously,  7,492,500  unfolded  data 
points  per  fuel):  PLS  would  require  a  considerable  amount  of  computer  memory  (about  13  GB), 
and  the  computation  time  would  be  prohibitively  long,  and  on  some  computer  systems  this 
computational exercise would fail due to memory constraints. In contrast, applying PLS on the
unfolded GC x GC - TOFMS chromatograms at one % distilled point at a time required fewer
LVs and significantly less memory (around 6.5 GB), and required less than a minute to compute
per PLS model.
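The one-model-per-point strategy can be sketched as a loop over the % distilled points, fitting a single-response PLS model at each. The example below is only an illustration under stated assumptions: it uses a textbook NIPALS PLS1 implementation in NumPy rather than PLS Toolbox, small synthetic random data in place of the unfolded chromatograms and measured ADC temperatures, and hypothetical function names (`pls1_fit`, `pls1_predict`).

```python
import numpy as np

def pls1_fit(X, y, n_lv):
    """Fit a single-response PLS (PLS1) model with the NIPALS algorithm.

    X is mean centered and y is autoscaled, mirroring the preprocessing
    described in the text. Returns the centering and scaling constants and
    the regression vector (in autoscaled-y units).
    """
    x_mean = X.mean(axis=0)
    y_mean, y_std = y.mean(), y.std(ddof=1)
    E = X - x_mean                    # X block, deflated at each step
    f = (y - y_mean) / y_std          # y block, deflated at each step
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = E.T @ f
        w = w / np.linalg.norm(w)     # weight vector
        t = E @ w                     # scores
        tt = t @ t
        p = E.T @ t / tt              # X loadings
        c = (f @ t) / tt              # y loading
        E = E - np.outer(t, p)
        f = f - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    regression_vector = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, y_std, regression_vector

def pls1_predict(model, X):
    """Predict responses in original units from a fitted PLS1 model."""
    x_mean, y_mean, y_std, regression_vector = model
    return (X - x_mean) @ regression_vector * y_std + y_mean

# Synthetic stand-in data (NOT the RP-1 data): 10 "fuels", 300 variables,
# 19 "% distilled" points, 4 latent variables per model.
rng = np.random.default_rng(0)
n_fuels, n_vars, n_points, n_lv = 10, 300, 19, 4
X = rng.normal(size=(n_fuels, n_vars))
true_weights = rng.normal(size=(n_vars, n_points))
Y = 230.0 + X @ true_weights * 0.1 + rng.normal(scale=0.2, size=(n_fuels, n_points))

# One PLS1 model per % distilled point; each has its own regression vector.
models = [pls1_fit(X, Y[:, j], n_lv) for j in range(n_points)]
lrvs = np.array([m[3] for m in models])
fitted = np.column_stack([pls1_predict(m, X) for m in models])
print(lrvs.shape, fitted.shape)
```

Because each model is fit independently, the number of latent variables could be changed per point by passing a different `n_lv` inside the loop, which is the flexibility described above.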

PLS  modeling  of  GC  x  GC  -  TOFMS  and  ADC  data 

The PLS modeling was validated using leave-one-out cross-validation (LOOCV).
Briefly, LOOCV involves constructing a series of PLS models, each from (n-1) samples of the original n sample
data set, and using each (n-1) model to predict the value for the sample that was left out. After all
combinations are analyzed, the root-mean-square error of cross-validation (RMSECV) of the residuals of
the PLS models is calculated [40]:


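$$\mathrm{RMSECV} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2} \qquad (1)$$

where y_i is the measured ADC temperature of the i-th sample, \hat{y}_i is the temperature predicted for that sample by the PLS model constructed without it, and n is the number of samples.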


Moreover,  RMSECV  results  were  also  used  to  help  determine  the  most  appropriate  number  of 
LVs  to  use  for  the  PLS  models. 
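The LOOCV procedure and the RMSECV calculation can be sketched as follows. This is a minimal illustration, assuming the usual definition of RMSECV as the square root of the mean squared cross-validation residual; a one-variable least-squares line stands in for the PLS model, and the data values are hypothetical, not measured ADC temperatures.

```python
import math

def rmsecv(measured, predicted):
    """Root-mean-square error between measured and cross-validated predictions."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n)

def fit_line(x, y):
    """Ordinary least-squares line; a simple stand-in for a PLS model."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_line(model, xi):
    slope, intercept = model
    return slope * xi + intercept

def loocv_predictions(x, y, fit, predict):
    """Leave each sample out in turn, fit on the rest, predict the left-out one."""
    predictions = []
    for i in range(len(y)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(x_train, y_train)
        predictions.append(predict(model, x[i]))
    return predictions

# Hypothetical data for ten samples at one % distilled point.
demo_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
demo_y = [210.3, 211.8, 214.1, 216.2, 217.7, 220.4, 221.9, 224.2, 226.1, 227.8]

demo_pred = loocv_predictions(demo_x, demo_y, fit_line, predict_line)
demo_rmsecv = rmsecv(demo_y, demo_pred)
print(round(demo_rmsecv, 3))
```

Repeating this calculation for models built with different numbers of LVs, and comparing the resulting RMSECV values, is one common way to choose the model complexity.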

At each step in the analytical procedure, the LRVs of the PLS models were inspected to
qualitatively verify that the connections the PLS models made between the chromatographic
information (GC x GC - TOFMS data) and physical measurements (ADC data) were
logical, and that the LRVs from consecutive models appeared continuous. Using information
provided  by  the  LRVs,  identification  of  compounds  of  interest  in  the  GC  x  GC  -  TOFMS  data 
was  performed  via  ChromaTOF  V.3.32  (LECO  Corporation,  St.  Joseph,  MI,  USA),  and  in-house 
software  for  nontarget  PARAFAC  for  well  resolved  and  unresolved  peaks,  respectively  [26]. 

The NIST 11 V2.0g mass spectral library (National Institute of Standards and Technology,
Boulder  CO,  USA)  was  used  for  mass  spectral  identification. 
Results and discussion


A representative GC x GC - TOFMS chromatogram of an RP-1 fuel is provided in
Fig. 1a. In this figure the total ion current (TIC) signal is plotted for the GC x GC separation of
the RP-1 fuel LB073009-08. To further demonstrate the separation power for complex samples
such as RP-1, specific regions of the GC x GC separation are provided in Fig. 1b, c, and d, with
a representative alkane, cycloalkane, and aromatic compound indicated, respectively. Each of
the representative compounds indicated is also a key compound identified in the PLS modeling
that will be presented herein. Fig. 1b shows a region of Fig. 1a at the selective mass
channel m/z 57; the highlighted peak (located at 8.75 min and 1.94 s on the primary and
secondary dimensions, respectively) has been identified as decane. Fig. 1c shows a
region of Fig. 1a at the selective mass channel m/z 136; the highlighted peak (located at 15.00
min and 1.17 s on the primary and secondary dimensions, respectively) has been identified as
adamantane. Finally, Fig. 1d shows a region of Fig. 1a at the selective mass channel m/z
105; the highlighted peak (located at 19.29 min and 0.95 s on the primary and secondary
dimensions, respectively) has been identified as methylbutylbenzene.

The  previously  measured  ADC  data  for  all  ten  RP-1  fuels  are  provided  in  Fig.  2a  [2]. 

The  measured  ADC  data  were  obtained  at  a  %  distilled  range  from  0.025%  to  90%.  The 
recorded temperatures for the ADC data set range from 207.2 °C to 213.5 °C at 0.025% distilled,
and from 235.9 °C to 258.1 °C at 90% distilled. At various % distilled values the ADCs for several fuel
pairs cross one another, which may potentially make the PLS modeling of ADC data more
challenging.  For  clarity,  in  Fig.  2b  two  representative  ADCs  are  provided  that  approximately 
span  the  range  of  temperatures  at  each  %  distilled.  In  Fig.  2b,  RP-1  fuel  LB073009-06 
represents the highest measured temperatures for the ten fuels, while RP-1 fuel LB073009-02
exhibited  some  of  the  lowest  recorded  temperatures. 

For comparison to Fig. 2b, the ADCs for RP-1 fuels LB073009-06 and LB073009-02
predicted using PLS during the LOOCV procedure are provided in Fig. 2c. Figs. 2b and 2c are
qualitatively very similar, indicating the ability of the PLS models to accurately predict fuel
physical data. To obtain a more quantitative evaluation of the accuracy of the PLS
modeling, residuals for each ADC were calculated at each % distilled value. The ADC residuals
were obtained by subtracting a measured ADC from the ADC predicted using PLS. The
residuals imply an accuracy of the PLS modeling to within a range of ±2.5 °C, which is deemed
reasonable for this initial study.

Examination of the LRVs of the PLS models provides additional information,
complementary to the ADCs predicted from the PLS models. In Fig. 3a-c, three of the nineteen
LRVs (one for each PLS model constructed) are provided, with the other LRVs omitted for brevity: one
LRV each from the beginning (0.025% distilled), middle (45% distilled), and end (90% distilled) of
the ADC. Inspection of the positive LRV values shows that the corresponding peaks tend to be
analyte compounds eluting after ~10 min for alkanes, after ~15 min for the cycloalkanes and di-
and tri-cycloalkanes, and, to a lesser extent, after ~17 min for aromatics. These results in
the  LRVs  display  a  general  pattern  that  the  less  volatile  compounds  contribute  positively  to  an 
ADC,  suggesting  less  volatile  compounds  increase  the  overall  predicted  temperature  of  the  ADC 
at  a  given  %  distilled  point.  As  the  %  distilled  approaches  90%,  the  intensities  of  the  positively 
contributing  peaks  in  the  LRVs  shift  to  the  right  to  less  volatile  compounds,  suggesting  these 
compounds  may  contribute  more  with  respect  to  the  predicted  ADC  temperature.  An  interesting 
observation  is  that  some  regions  (and  peaks  therein)  in  the  LRVs  change  sign  as  the  distillation 
runs toward completion; a good example is a cluster of peaks located between ~13 and 17 min in the
primary separation dimension and between ~1.2 and 1.5 s in the secondary separation dimension.
Although the peaks in the LRVs in this separation region are generally positive at 0.025%
distilled, as the distillation progresses the magnitude of many peaks diminishes until their
contribution is zero; then, as the distillation progresses further, the signs of these peaks change to
negative with a corresponding increase in magnitude. This suggests that early in the distillation,
analyte  compounds  corresponding  to  peaks  in  the  LRV  that  are  changing  from  positive  to 
negative  during  the  distillation  would  contribute  to  increasing  the  predicted  ADC  temperature, 
but  approaching  the  end  of  the  distillation  these  compounds  would  contribute  to  decreasing  the 
predicted ADC temperature. These compounds seem to act analogously to a chemical buffer:
just as buffers moderate changes in pH, these compounds moderate the temperature range of the
distillation, i.e., the more of these compounds present, the narrower the temperature range over
which the distillation will occur.

An  interesting  phenomenon  is  observed  at  the  higher  %  distilled  values,  as  shown  in  Fig. 
3c.  There  are  several  unexpected,  slightly  positive  peaks  in  the  LRV  region  between  5  and  15 
min.  At  90%  distilled  the  chemical  composition  of  the  fuels  is  actually  a  subset  of  the  fuel 
composition  that  is  analyzed  by  the  GC  x  GC  -  TOFMS  instrument,  since  at  90%  distilled  the 
more  volatile  compounds  will  have  mostly  boiled  off,  and  there  likely  have  been  some 
significant  changes  in  the  relative  compositions  of  the  various  compounds  in  the  fuels.  Thus,  the 
positive  value  peaks  in  the  LRVs  in  the  region  between  5  and  15  min  may  be  attributed  to 
covariance  between  compounds  that  are  more  volatile  and  compounds  that  are  less  volatile  in  the 
PLS  models  (due  to  inherent  similarities  of  the  RP-1  lab  blends),  and  not  necessarily  because 
these LRV peaks are chemically meaningful; this may lead to a larger source of error in PLS
models at higher % distilled values.

Upon inspection of the negative regions (and peaks therein) in all of the LRVs, analyte
compounds eluting between 5 and 15 min generally have negative values, suggesting that the earlier-eluting,
more volatile compounds lower the overall temperature of the ADC at a given % distilled. As
with  the  positive  LRV  values,  as  the  distillation  progresses  from  0.025%  to  90%  distilled,  the 
intensity  shifts  from  left  to  right.  As  the  temperature  rises,  the  more  volatile  compounds 
preferentially  evaporate,  so  their  decreased  presence  reduces  their  influence  on  the  overall 
temperature  at  higher  %  distilled  values,  while  the  heavier,  less  volatile  compounds  contribute 
more. Representative key analyte compounds of interest, indicated by large peak
magnitudes in the LRVs, were identified and are summarized in Tables 2, 3, and 4. For example,
methylbutylbenzene (identified in Fig. 1d) is listed in Table 2, and is one of the major positively
contributing compounds to the LRV. Decane (identified in Fig. 1b) is listed in Table 3, and is
one of the major negatively contributing compounds to the LRV. Adamantane (identified in Fig.
1c), listed in Table 4, is one of the significant compounds that change sign with respect to their
contribution as the ADC nears completion. Identification of compounds that impact the ADC
can play an important role in understanding the information provided by the ADC experiment,
and ultimately could play a key role in improving fuel formulation and performance.
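The grouping of compounds into positively contributing, negatively contributing, and sign-changing features (Tables 2, 3, and 4) can be expressed as a simple rule on the LRV coefficients of one feature across the % distilled points. The sketch below is illustrative only: the function name, the tolerance parameter, and the coefficient values are hypothetical, and the identification in this work was done by inspection of the LRVs rather than by this rule.

```python
def classify_feature(coefficients, tolerance=0.0):
    """Classify one LRV feature from its coefficients ordered by % distilled.

    Returns "positive" or "negative" when the sign is the same at every point,
    "positive_to_negative" when the feature starts positive and ends negative,
    and "other" for any remaining pattern.
    """
    first, last = coefficients[0], coefficients[-1]
    if all(c > tolerance for c in coefficients):
        return "positive"
    if all(c < -tolerance for c in coefficients):
        return "negative"
    if first > tolerance and last < -tolerance:
        return "positive_to_negative"
    return "other"

# Hypothetical coefficients at the beginning, middle, and end of the ADC.
example_lrv_values = {
    "feature_a": [0.8, 0.9, 1.1],     # positive throughout
    "feature_b": [-0.7, -0.5, -0.2],  # negative throughout
    "feature_c": [0.4, 0.0, -0.6],    # changes sign during the distillation
}

classes = {name: classify_feature(values)
           for name, values in example_lrv_values.items()}
print(classes)
```

A nonzero `tolerance` could be used to ignore coefficients that are too small to interpret, which matters because small positive features may reflect covariance between compounds rather than chemistry.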

Finally,  we  present  the  LOOCV  summary  using  the  RMSECV  calculation  defined  in  Eq. 
(1)  as  a  function  of  the  %  distilled  value.  The  LOOCV  procedure  for  the  PLS  modeling  was 
performed  using  both  sets  of  GC  x  GC  -  TOFMS  data  with  the  ADC  data  set.  The  most 
appropriate  number  of  latent  variables  (LVs)  was  determined  to  be  4,  based  upon  the  analysis  of 
scree  plots  [22].  The  LOOCV  summary  in  Fig.  4  provides  an  assessment  of  the  accuracy  of  the 
PLS models. The residuals (Fig. 2d) of many of the RP-1 fuels cross at 80% distilled along with
a  sharp  increase  in  the  RMSECV  in  Fig.  4  (at  85%  and  90%  distilled).  These  changes  are  linked 
to  the  changes  in  fuel  composition  as  more  fuel  is  distilled  and  the  resulting  covariance  between 
compounds  of  different  volatility  that  appear  in  the  chromatograms.  In  principle,  distillate 
fractions of RP-1 fuels could be collected at each % distilled and analyzed with the GC x GC -
TOFMS,  and  the  resulting  chromatograms  could  be  used  to  construct  the  PLS  models  using  their 
respective  temperatures  on  the  ADC  data.  However,  this  approach  is  more  laborious  and 
impractical,  requiring  a  prohibitively  large  set  of  samples,  e.g.,  190  samples,  from  10  fuels  x  19 
ADC  %  distilled  points  (instead  of  only  10  fuel  samples  directly  analyzed  herein  in  order  to 
demonstrate  the  methodology  principles).  The  primary  benefit  of  collecting  and  analyzing 
distillate  fractions  at  each  %  distilled  value  would  be  to  reduce  the  apparent  covariance,  thus 
making  the  RMSECV  values  (in  Fig.  4)  consistently  smaller  across  the  ADC.  Another  way  to 
think  about  this  source  of  the  error  while  approaching  the  end  of  the  distillation  is  that  PLS  is 
using  the  chromatograms  of  un-distilled  RP-1  fuels  to  “predict  the  future”  ADC  values.  It  is 
likely  that  better  PLS  models  could  be  constructed  from  chromatograms  generated  from  the  RP-1 
fuels  sampled  at  each  %  distilled.  Using  a  respective  chromatogram  of  a  fuel  at  each  distillation 
point  would  have  been  more  representative  of  the  fuel  and  would  have  helped  minimize  the  error 
of the PLS models. However, obtaining said RP-1 samples at various stages of distillation poses a
significantly  more  laborious  proposition. 

Conclusions 

In  this  report  we  have  demonstrated  the  use  of  PLS  on  GC  x  GC  -  TOFMS 
chromatograms  of  RP-1  fuels,  and  their  respective  ADCs.  The  PLS  modeling  provides  insight 
into how the chemical composition weighs differently in determining the temperature for a given
% distilled value across the ADC. Compounds were discovered that correlate with narrowing
the temperature range over which the distillation occurs. The predictive power of the PLS
modeling, assessed using LOOCV, was found to be strong, yielding low RMSECV
values, typically below 2.0 °C, at each % distilled measurement point during the ADC
analysis. This outcome bodes well for potential future studies with expanded fuel sample sets.
Tables

Table  1.  RP-1  Fuel  Set  [22]. 


Sample number   NIST Number [1-2]   AFRL Designation
1               11                  LB080409-01
2               10                  LB073009-06
3               9                   LB073009-08
4               8                   LB080409-05
5               7                   LB073009-05
6               5                   LB073009-01
7               4                   LB073009-09
8               1                   LB073009-02
9               2                   LB073009-03
10              3                   XC2521HW10

Table 2. Major contributing compounds identified in the LRVs that contribute positively, per the
blue features in Fig. 3a, b, c. The retention time on the primary column is labeled 1tR, and on the
secondary column as 2tR. The mass spectral match value is labeled MV.

#    Compound Identification                    1tR (min)   2tR (s)   MV    Compound Class
1    Trimethyldodecane (C15H32)                 17.42       2.23      924   alkanes
2    3-Methyltridecane (C14H30)                 19.88       2.00      910   alkanes
3    3-Methyltetradecane (C15H32)               20.17       1.96      922   alkanes
4    Heptylcyclohexane (C13H26)                 18.75       1.62      889   cycloalkanes
5    Octylcyclohexane (C14H28)                  21.25       1.60      909   cycloalkanes
6    Nonylcyclohexane (C15H30)                  23.63       1.60      929   cycloalkanes
7    Methyl-bicyclohexyl (C13H24)               20.79       1.36      841   di- & tri- cycloalkanes
8    Hexamethyloctahydro-1H-indene (C15H28)     22.21       1.43      832   di- & tri- cycloalkanes
9    Bicyclohexane (C15H28)                     20.00       1.38      907   di- & tri- cycloalkanes
10   Methylbutylbenzene (C11H16)                19.29       0.95      908   mono-aromatics
11   Azulene (C10H8)                            26.83       0.76      919   di-aromatics

Table 3. Major contributing compounds identified in the LRVs that contribute negatively, per the
red features in Fig. 3a, b, c.

#    Compound Identification                  1tR (min)   2tR (s)   MV    Compound Class
1    Methylnonane (C10H22)                    7.46        1.94      937   alkanes
2    Decane (C10H22)                          8.21        1.94      960   alkanes
3    Dimethylnonane (C11H24)                  8.42        2.18      931   alkanes
4    Trimethylcyclohexane (C9H18)             7.42        1.40      943   cycloalkanes
5    Methylpropylcyclohexane (C10H20)         8.92        1.63      873   cycloalkanes
6    Ethyldimethylcyclohexane (C10H20)        9.21        1.53      864   cycloalkanes
7    cis-Octahydro-1H-indene (C9H16)          11.17       1.24      948   di- & tri-cycloalkanes
8    Dimethylbicyclo[3.2.1]octane (C10H18)    11.96       1.34      890   di- & tri-cycloalkanes
9    Not found at significant level                                       mono-aromatics
10   Not found at significant level                                       di-aromatics

Table 4. Compounds of interest identified in the LRVs that exhibit a sign change across the ADC
(from positive to negative), per Fig. 3.

#    Compound Identification                  1tR (min)   2tR (s)   MV    Compound Class
1    Trimethyldecane (C13H28)                 14.83       2.24      898   alkanes
2    Methyldodecane isomer (C13H28)           14.67       2.10      926   alkanes
3    Methyldodecane isomer (C13H28)           15.00       2.10      940   alkanes
4    Not found at significant level                                       cycloalkanes
5    Not found at significant level                                       cycloalkanes
6    trans-decahydronaphthalene (C10H18)      12.83       1.35      930   di- & tri-cycloalkanes
7    Adamantane (C10H16)                      15.00       1.17      959   di- & tri-cycloalkanes
8    Methyldecahydronaphthalene (C11H20)      14.00       1.42      940   di- & tri-cycloalkanes
9    Not found at significant level                                       mono-aromatics
10   Not found at significant level                                       di-aromatics

Figure  Captions 

Fig. 1 (a) Total ion current (TIC) chromatogram of the RP-1 fuel LB073009-08, collected using
GC x GC - TOFMS. Compound classes are indicated. (b) Region between 5 min and 12 min in
the primary dimension and 1.7 s and 2.5 s in the secondary dimension at m/z 57, the upper left
box in (a), with n-decane identified. (c) Region between 13 min and 19 min in the primary
dimension and 1.0 s and 1.8 s in the secondary dimension at m/z 136, the middle box in (a), with
adamantane identified. (d) Region between 18 min and 24 min in the primary dimension and 0.8
s and 1.2 s in the secondary dimension at m/z 105, the lower right box in (a), with
methylbutylbenzene identified.

Fig. 2 (a) Measured ADC data for the ten RP-1 fuels (listed in Table 1). (b) The
ADCs of two RP-1 fuels that span the approximate range of the ADC data set: top
LB073009-06, bottom LB073009-02. (c) The PLS modeled ADCs for the two fuels in part (b):
top LB073009-06, bottom LB073009-02. (d) The ADC residuals for all ten of the
RP-1 fuels, calculated as the predicted ADC obtained from the cross-validated PLS
models minus the measured ADC.

Fig. 3 (a) Linear regression vector (LRV) of a 4LV PLS model at 0.025% distilled of the ADC,
with blue indicating a positive contribution to the LRV and red indicating a negative
contribution. (b) LRV of a 4LV PLS model at 45% distilled (the middle) of the ADC. (c) LRV
of a 4LV PLS model at 90% distilled (the end) of the ADC.

Fig.  4  Validation  results  are  provided  for  the  PLS  models  of  the  ADCs  for  the  ten  RP-1  fuels  in 
Table  1  using  LOOCV.  The  RMSECV  values  for  PLS  modeling  of  both  sets  of  GC  x  GC  - 
TOFMS  data  are  indicated  as  a  function  of  %  distilled. 